GPU Acceleration for the C++ Standard Template Library

Author

Christian DeLozier

Abstract

Modern programmers must exploit parallelism to gain performance, possibly by using an attached or on-chip GPU. To use the GPU from a C++ program, the programmer must currently adopt either a new language (CUDA or OpenCL) or an external library (Thrust). Rather than requiring programmers to learn new tools, modify existing code, and change their software development practices, the C++ Standard Template Library (STL) can be modified to accelerate common algorithms on the GPU automatically. This paper presents libcxxgpu, a GPU-accelerated version of the C++ STL. Calls to the algorithms provided by the C++ STL are executed on the GPU through the Thrust library, depending on a set of heuristics that determine when to use the CPU and when to use the GPU. In this paper, we detail the implementation of the accelerated library, highlight the challenges encountered, and analyze the performance factors that determine which device should be used.

CR Categories: D.1.3 [Programming Techniques]: Concurrent Programming - Parallel Programming; D.2.2 [Software Engineering]: Design Tools and Techniques - Software Libraries
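The heuristic dispatch described in the abstract can be illustrated with a minimal sketch. This is not the libcxxgpu implementation: the namespace `accel_sketch`, the functions `choose_device` and `sort_on_gpu`, and the cutoff `kGpuThreshold` are hypothetical names and values chosen for illustration, and input size is only one plausible heuristic. The GPU branch uses Thrust calls (`thrust::device_vector`, `thrust::sort`, `thrust::copy`) and is compiled only under nvcc; with an ordinary C++ compiler the sketch always takes the CPU path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

#if defined(__CUDACC__)
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#endif

namespace accel_sketch {

enum class Device { Cpu, Gpu };

// Hypothetical cutoff, not taken from the paper: below this element count the
// host-to-device transfer is assumed to cost more than the GPU saves.
constexpr std::size_t kGpuThreshold = std::size_t{1} << 20;

// Simplification: treats "compiled with nvcc" as "GPU available". A real
// library would also check at run time that a device is actually present.
constexpr bool gpu_available() {
#if defined(__CUDACC__)
  return true;
#else
  return false;
#endif
}

// Size-based heuristic: use the GPU only for large inputs when one exists.
inline Device choose_device(std::size_t n, bool has_gpu = gpu_available()) {
  return (has_gpu && n >= kGpuThreshold) ? Device::Gpu : Device::Cpu;
}

template <typename RandomIt>
void sort_on_gpu(RandomIt first, RandomIt last) {
#if defined(__CUDACC__)
  using T = typename std::iterator_traits<RandomIt>::value_type;
  thrust::device_vector<T> d(first, last);  // copy host -> device
  thrust::sort(d.begin(), d.end());         // sort on the device
  thrust::copy(d.begin(), d.end(), first);  // copy device -> host
#else
  std::sort(first, last);  // no GPU toolchain: fall back to the CPU
#endif
}

// Drop-in style entry point with the same call shape as std::sort.
template <typename RandomIt>
void sort(RandomIt first, RandomIt last) {
  const auto n = static_cast<std::size_t>(std::distance(first, last));
  if (choose_device(n) == Device::Gpu) {
    sort_on_gpu(first, last);
  } else {
    std::sort(first, last);
  }
  assert(std::is_sorted(first, last));  // postcondition of either path
}

}  // namespace accel_sketch
```

A caller would write `accel_sketch::sort(v.begin(), v.end());` exactly as with `std::sort`, which is the point of the approach: the device choice stays inside the library. The two copies in `sort_on_gpu` are the kind of transfer cost a heuristic like this has to weigh against the GPU's faster sort.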


Similar resources

Thrust: A Productivity-Oriented Library for CUDA

This chapter demonstrates how to leverage the Thrust parallel template library to implement high-performance applications with minimal programming effort. Based on the C++ Standard Template Library (STL), Thrust brings a familiar high-level interface to the realm of GPU Computing while remaining fully interoperable with the rest of the CUDA software ecosystem. Applications written...


Multi-Stage Programming for GPUs in Modern C++ using PACXX

Writing and optimizing programs for high performance on systems with GPUs remains a challenging task even for expert programmers. One promising optimization technique is to evaluate parts of the program upfront on the CPU and embed the computed results in the GPU code allowing for more aggressive compiler optimizations. This technique is known as multi-stage programming and has proven to allow ...


Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...


Gaussian Process Models with Parallelization and GPU acceleration

In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration. The parallelization scheme arises naturally from the modular computational structure w.r.t. datapoints in the sparse Gaussian process formulation. Additionally, the computational bottleneck is implemented with GPU acceleration for further speed up. Combining both techni...


Accelerating QDP++ using GPUs

Graphic Processing Units (GPUs) are getting increasingly important as target architectures in scientific High Performance Computing (HPC). NVIDIA established CUDA as a parallel computing architecture controlling and making use of the compute power of their GPUs. CUDA provides sufficient support for C++ language elements to enable the Expression Template (ET) technique in the device memory domai...



Publication date: 2012
